Skip to content

try-election-with-applied#35283

Open
cadem wants to merge 1 commit into
3.0from
dmchen/try-election-with-applied
Open

try-election-with-applied#35283
cadem wants to merge 1 commit into
3.0from
dmchen/try-election-with-applied

Conversation

@cadem
Copy link
Copy Markdown
Contributor

@cadem cadem commented May 6, 2026

Description

Issue(s)

  • Close/close/Fix/fix/Resolve/resolve: Issue Link

Checklist

Please check the items in the checklist if applicable.

  • Is the user manual updated?
  • Are the test cases passed and automated?
  • Is there no significant decrease in test coverage?

Copilot AI review requested due to automatic review settings May 6, 2026 10:25
@cadem cadem requested review from a team, dapan1121, guanshengliang and zitsen as code owners May 6, 2026 10:25
Copy link
Copy Markdown
Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request enhances the Raft leader election mechanism by prioritizing candidates with higher applied progress. Key changes include extending the SyncRequestVote structure to carry the candidateAppliedIndex, implementing comparison logic during vote evaluation, and adding extensive test coverage across unit and cluster-level tests. Feedback highlights a critical concern regarding the breaking change to the RPC protocol, which may hinder rolling upgrades. Additionally, the reviewer warned that enforcing appliedIndex as a mandatory voting condition could risk election deadlocks and suggested treating it as a preference. There is also a note about reduced debuggability in the test library due to simplified JSON serialization.

SyncTerm term;
SyncIndex lastLogIndex;
SyncTerm lastLogTerm;
SyncIndex candidateAppliedIndex;
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

Adding candidateAppliedIndex to the SyncRequestVote struct changes its memory layout and size. This is a breaking change for the RPC protocol and will prevent rolling upgrades in a distributed cluster. Consider using a versioned message format or appending new fields in a way that maintains backward compatibility.

Comment on lines +140 to +141
bool grant = (pMsg->term == currentTerm) && logOK && appliedOK &&
((!hasVoted) || syncUtilSameId(&ths->raftStore.voteFor, &pMsg->srcId));
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

Making appliedOK a mandatory condition for granting a vote (&& appliedOK) poses a risk to cluster availability. If a node with the most up-to-date log is lagging in its FSM application (lower appliedIndex), it will be rejected by nodes that have applied more, even if those nodes have stale logs. This can lead to election deadlocks. To preserve Raft liveness, this should be a preference (e.g., via randomized delays) rather than a hard requirement for voting.

Comment on lines +93 to +94
snprintf(ptrbuf, sizeof(ptrbuf), "%p", pCache);
cJSON_AddStringToObject(pRoot, "opaqueCache", ptrbuf);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The detailed JSON serialization of the Raft cache has been replaced with a simple pointer address. This significantly degrades debuggability during testing. If the previous implementation was problematic, it should be fixed rather than removed.

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This pull request implements an applied-index-aware Raft leader election preference in the sync subsystem by extending the SyncRequestVote contract to carry candidateAppliedIndex, using that value in vote-grant decisions, and adding unit/cluster-style regressions plus supporting debug/docs updates.

Changes:

  • Extend SyncRequestVote with candidateAppliedIndex and populate/log it during RequestVote send/receive.
  • Add vote-grant logic that rejects candidates whose applied index is behind the local node (with fallback when applied index is unavailable).
  • Add/adjust sync unit tests (RequestVote serialization/decision, election preference, deterministic tie reset) and a pytest cluster regression; add design/plan docs.

Reviewed changes

Copilot reviewed 29 out of 29 changed files in this pull request and generated 10 comments.

Show a summary per file
File Description
tests/pytest/cluster/syncingTest.py Adds timing + lagging-node regression scenario for replica-3 clusters
tests/pytest/cluster/clusterSetup.py Adds node1/node2/node3 convenience fields derived from tdnodes
TD-xxx-try-election-with-applied Adds a one-line instruction/placeholder text file
source/libs/sync/test/syncVotesGrantedTest.cpp Updates syncNodeOpen call signature usage and adds deterministic tie-reset regression
source/libs/sync/test/syncRequestVoteTest.cpp Adds applied-index round-trip + applied-index decision tests
source/libs/sync/test/syncElectTest.cpp Adds a local in-process election harness and validates higher-applied node wins
source/libs/sync/test/sync_test_lib/src/syncSnapshotDebug.c Updates snapshot debug JSON to new sender/receiver fields
source/libs/sync/test/sync_test_lib/src/syncRaftEntryDebug.c Makes raft cache debug output opaque (pointer-only)
source/libs/sync/test/sync_test_lib/src/syncMessageDebug.c Adds message-type compatibility macros + RequestVote builder with applied index + JSON field
source/libs/sync/test/sync_test_lib/src/syncMainDebug.c Fixes build by casting for raftStoreGetTerm in const debug path
source/libs/sync/test/sync_test_lib/src/syncIO.c Updates test IO helpers to new queue/Qall allocation APIs and version string usage
source/libs/sync/test/sync_test_lib/inc/syncTest.h Declares new RequestVote helper; makes raft cache types opaque for tests
source/libs/sync/src/syncUtil.c Extends RequestVote send/recv logging to include applied index
source/libs/sync/src/syncRequestVote.c Adds applied-index gate into vote granting
source/libs/sync/src/syncMessage.c Initializes candidateAppliedIndex on RequestVote build
source/libs/sync/src/syncMain.c Adds syncNodeGetAppliedIndex() accessor via FSM callback
source/libs/sync/src/syncElection.c Populates candidateAppliedIndex into outgoing RequestVote messages and uses unified logging
source/libs/sync/inc/syncUtil.h Notes RequestVote logs now include applied-progress context
source/libs/sync/inc/syncRequestVote.h Exposes syncNodeOnRequestVoteAppliedIndexOK()
source/libs/sync/inc/syncMessage.h Adds candidateAppliedIndex to SyncRequestVote struct
source/libs/sync/inc/syncInt.h Declares syncNodeGetAppliedIndex()
docs/001-raft-appliedindex-election/tasks.md Adds task breakdown and execution order for the feature
docs/001-raft-appliedindex-election/spec.md Adds feature spec and acceptance scenarios
docs/001-raft-appliedindex-election/research.md Documents design decisions and non-goals
docs/001-raft-appliedindex-election/quickstart.md Adds build/run/validation steps
docs/001-raft-appliedindex-election/plan.md Adds implementation plan and repo structure notes
docs/001-raft-appliedindex-election/data-model.md Defines message/data entities and validation rules
docs/001-raft-appliedindex-election/contracts/request-vote-applied-index.md Specifies the RequestVote applied-index preference contract
docs/001-raft-appliedindex-election/checklists/requirements.md Adds requirements completeness checklist for the spec

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

@@ -41,7 +74,15 @@ def run(self):

tdSql.execute("alter table meters add col col5 int")
tdSql.execute("alter table meters drop col col5 int")
Comment on lines +37 to +44
deadline = time.time() + 30
replica_ready = False
while time.time() < deadline:
tdSql.query("show %s.vgroups" % db_name)
if tdSql.queryRows > 0:
replica_ready = True
break
time.sleep(1)
Comment on lines +93 to +97
sNInfo(ths,
"appliedok:1, fallback due to unavailable applied index, {my-applied:%" PRId64 ", recv-applied:%" PRId64
", recv-term:%" PRIu64 "}",
*pLocalApplied, pMsg->candidateAppliedIndex, pMsg->term);
return true;
Comment on lines +101 to +105
sNWarn(ths,
"appliedok:0, candidate applied index behind local state, {my-applied:%" PRId64 ", recv-applied:%" PRId64
", recv-term:%" PRIu64 "}",
*pLocalApplied, pMsg->candidateAppliedIndex, pMsg->term);
return false;
Comment on lines +108 to +112
sNInfo(ths,
"appliedok:1, candidate applied index acceptable, {my-applied:%" PRId64 ", recv-applied:%" PRId64
", recv-term:%" PRIu64 "}",
*pLocalApplied, pMsg->candidateAppliedIndex, pMsg->term);
return true;
Comment on lines 45 to +48
typedef struct SyncPing SyncPing;
typedef struct SyncPingReply SyncPingReply;
typedef void SRaftEntryHashCache;
typedef void SRaftEntryCache;
Comment on lines 199 to 203
", sby:%d, "
"r-num:%d, "
"lcfg:%" PRId64 ", chging:%d, rsto:%d",
pSyncNode->vgId, syncStr(pSyncNode->state), raftStoreGetTerm(pSyncNode), pSyncNode->commitIndex,
pSyncNode->vgId, syncStr(pSyncNode->state), raftStoreGetTerm((SSyncNode*)pSyncNode), pSyncNode->commitIndex,
logBeginIndex, logLastIndex, snapshot.lastApplyIndex, pSyncNode->raftCfg.isStandBy, pSyncNode->replicaNum,
Comment on lines +13 to +18
Update the sync runtime in these areas:

- Add the applied-index field to `SyncRequestVote` in `community/source/libs/sync/inc/syncMessage.h`.
- Populate the field in `community/source/libs/sync/src/syncElection.c` when building vote requests.
- Read local applied progress from `FpAppliedIndexCb` in `community/source/libs/sync/src/syncRequestVote.c` and incorporate it after the existing log-recency gate.
- Extend request-vote logging in `community/source/libs/sync/src/syncUtil.c` or adjacent sync logging paths so operators can see the comparison inputs.
@@ -0,0 +1 @@
改进目前raft的选举实现,加入对appliedIndex的判断,appliedIndex大的更优先被选举为leader。不要替我创建git分支。 No newline at end of file
uint16_t gPorts[] = {7010, 7110, 7210, 7310, 7410};
const char* gDir = "./syncElectTest";
int32_t gVgId = 1234;
static SyncIndex electAppliedIndexCb(const SSyncFSM* pFsm) { return ((const ElectTestFsm*)pFsm)->appliedIndex; }
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants